FOREWORD


The Machine Translation group at Thompson Ramo Wooldridge
Inc. has been working under the sponsorship of the Intelligence
Laboratory of the Rome Air Development Center, Griffiss Air Force
Base, since 1959. This research continues work done under previous
contracts with RADC.

During  the  course  of  the  present  research  a  concurrent  contract 
has  been  in  effect  with  the  National  Science  Foundation.  Some  of  the 
tools  used  in  the  present  research  were  created  under  NSF  support. 

In general, studies in the area of semantics have been done under
contract with RADC, while work to improve the techniques for
research in MT has been done under contract to NSF.

The  support  of  the  Intelligence  Laboratory  and  its  members 
at  the  Rome  Air  Development  Center  is  hereby  gratefully  acknowledged. 


This  report  represents  the  work  of 

Loretta  Ertel 
Herbert  L.  Holley 
Paul  L.  Garvin 

Jules Mersel (Project Manager)

Christine  A.  Montgomery 
George  Onischenko 
Gerhard  Reitz 

Steven  B.  Smith  (Associate  Project  Manager) 




ABSTRACT 


A new semantic research technique has been developed and
employed under this contract. This technique makes possible the
construction of semantic classes of words which share the
characteristic of specifying a particular translation for a given
polysemantic Russian content word.

Improvements  have  been  made  in  the  RW  program  for  the 
areas  of  subject  recognition  and  clause  boundary  determination. 

Conclusions  were  reached  about  an  improved  English 
synthesis.  The  latter  involves  reformulating  Russian  structures 
into  appropriate  but  non-corresponding  English  structures. 

The  major  hardware  implication  of  the  research  suggests 
that  associative  memories  may  provide  a  simplification  in  the  area 
of  machine  translation. 

Flow  charts  and  a  sample  translation  are  included. 
INTRODUCTION


The  first  section  of  the  report  describes  the  semantic  research 
technique  which  was  developed  under  this  contract.  This  technique 
is  dependent  upon  our  capability  for  automatic  sentence  parsing  by 
the  Fulcrum  approach. 

There were two goals in this research: first, to define rules
for a particular type of multiple meaning problem that had
heretofore been either ignored or considered impossible to solve;
second, to determine what light such rules might shed on the
problem of semantics in general.

Both  goals  were  achieved;  the  second  in  particular  displayed 
certain  consistencies  in  the  Russian  semantic  system  which  are 
remarkable. 

In  our  research,  semantics  means  the  ability  to  choose  the 
correct  translation  from  among  those  provided  in  the  dictionary. 

Our studies have shown that the other text words which determine
this selection often fall into groups which themselves seem
conceptually related. There has always been a hope among researchers
that such semantic classes could be defined. We feel that our
technique allows us to uncover the identity of these classes and
may possibly even provide a metric for the semantic distance function.
SEMANTIC RESEARCH


The  first  step  in  our  research  was  to  select  Russian  text  and 
acquire  corresponding  English  translations  of  that  text.  The  Russian 
text  was  selected  from  the  field  of  biology  and  was  taken  from  the 
Doklady  Akademii  Nauk.  All  the  text  was  chosen  from  the  same 
field  because  we  were  interested  in  problems  of  multiple  meaning 
within  a  single  field,  rather  than  differentiations  in  translations 
which  were  field-defined. 


On  the  opposite  page  we  show  p.  564 
of  Vol.  123  of  the  Doklady  for  biology. 
[Facsimile of p. 564 of Doklady Akademii Nauk SSSR, Vol. 123,
No. 3, biology section: the opening page of the article by
С. А. Милейковский, "Лунная периодичность нереста у литоральных
и верхнесублиторальных беспозвоночных Белого моря и других
морей" (submitted 8 VIII 1958). The body of the Russian text is
not legible in this copy.]
The next problem was that of deciding which words to deal
with. In the past, a good deal of semantic research has been
devoted to problems of multiple meaning in connection with function
words (such as prepositions, particles, conjunctions), while,
except for a few instances of idioms, the problem of the multiple
meaning of content words (i.e., nouns, verbs, etc.) was ignored.
We therefore determined to fix our primary attention on content
words.


On  the  opposite  page,  we  give  a 
sample  of  the  content  words  investigated 
along  with  their  multiple  translations. 
связь

    A.  in the process of        I.  relating to
    B.  connection(s)            J.  owing to
    C.  on account of            K.  due to
    D.  in view of               L.  0
    E.  relation                 M.  relationship
    F.  in response to           N.  related
    G.  accordingly              O.  thus
    H.  because of

дальнейший

    A.  then                     F.  ultimately
    B.  now                      G.  further
    C.  subsequently             H.  later
    D.  subsequent               I.  henceforth
    E.  prolonged

разный

    A.  various                  D.  differing
    B.  different                E.  varying
    C.  variation in the         F.  differences in the

течение

    A.  for                      C.  over the course
    B.  course                   D.  0

сторона

    A.  side                     E.  on the other hand
    B.  toward                   F.  on one hand
    C.  direction                G.  laterally
    D.  past

случай

    A.  case                     C.  events
    B.  occasions                D.  incidents


Next it was necessary to find out which content words had
multiple meaning problems within a single field, and occurred
with sufficient frequency to allow us to make some generalizations
about their behavior. For this purpose a program was written to
give us a frequency count of all words occurring within a large
body of text, and print out the individual Russian forms in the
order of their frequency of occurrence. We thereupon decided
to investigate all those content words with multiple equivalents
that had one form which occurred 20 or more times, as well as
certain other words of particular interest to us.
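
The counting and selection step just described can be sketched as
follows. This is a minimal illustration in Python, not the program
used in the project: the function names, the tokenization rule
(runs of letters, case ignored), and the transliterated sample text
are all assumptions made for the example.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (form, count) pairs, most frequent first.

    Forms with equal counts are listed alphabetically.
    """
    # A "form" here is any unbroken run of letters, upper-cased.
    forms = re.findall(r"[^\W\d_]+", text.upper())
    counts = Counter(forms)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def frequent_forms(freq, threshold=20):
    """Keep the forms that occur at least `threshold` times."""
    return [form for form, count in freq if count >= threshold]


if __name__ == "__main__":
    sample = "V SVJAZI S ETIM V SVJAZI"  # hypothetical transliterated text
    freq = frequency_list(sample)
    for form, count in freq:
        print(form, count)
    print(frequent_forms(freq, threshold=2))
```

The threshold of 20 is the one stated in the text. Deciding which of
the frequent forms are content words with multiple equivalents
remains a manual step.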


The following is a sample of the output
of the word frequency program. The words
are listed in order of frequency and the number
to the far left indicates the position the
word occupies in the list. The first group
of numbers to the right of a word indicates
the number of times that word occurred.
The next group of numbers indicates what
percentage of the total number of occurrences
of all words is represented by this word and
all words prior to it in the list. For example,
this provides us with the information that the
77 most frequent words (out of a total of
37859 running words) represent one third of
the occurrences in this text. The next group
indicates how much of the total text is represented
by occurrences of this word. The next
to last group is the position of the word in the
list multiplied by its frequency of occurrence.
(This number, according to Zipf's law, should
approximate 0.1. Our "Zipf" number, however,
does not take into account that many words,
in reality, occupy the same position in the
list.) The last group of numbers (here zeros)
has as yet no significance.
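
The columns described above can be computed as in the following
sketch. It is an illustration only: the function name and the toy
counts are assumptions, and the "Zipf" column is taken to be the
position multiplied by the relative frequency (count divided by the
total number of running words), which is the reading under which a
value near 0.1 would be expected. As in the report's listing, words
of equal frequency are simply given consecutive positions.

```python
def frequency_table(counts):
    """Build the listing columns from counts sorted in descending order.

    Each row is (position, count, cumulative percent, percent, zipf).
    """
    total = sum(counts)
    rows = []
    running = 0
    for position, count in enumerate(counts, start=1):
        running += count
        cumulative_pct = 100 * running / total  # this word and all before it
        own_pct = 100 * count / total           # this word alone
        zipf = position * count / total         # position x relative frequency
        rows.append((position, count, cumulative_pct, own_pct, zipf))
    return rows


if __name__ == "__main__":
    for row in frequency_table([50, 30, 20]):  # toy counts, not report data
        print(row)
```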


[Sample page of output from the word frequency program. The
listing is not legible in this copy.]



The next step was to list the contexts in which these forms
occurred. In order to find them in text, another program was
devised to print an alphabetical list of all the words that occurred
in text along with their text locations.


The following is a sample of the output
of the program to provide a concordance.
The first number to the right of a word
indicates the number of times that word
occurred. Following this, for each occurrence
of the word, five things are provided. The first
letter (B here in every case) indicates the
corpus in which the word occurred, the second
letter the article, and the third letter the page
of that article. The first number indicates
the line on which the word occurred and the
second the position on that line (first word,
second word, etc.). For example, AKTE
occurred once in corpus B, article R, page
A, line 12, word 2.
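
A concordance with locations of this five-part form can be sketched
as below. This is an assumed reconstruction for illustration, not the
project's program: the function name, the input layout (a dictionary
from article and page letters to lists of text lines), and the sample
line are hypothetical. Only the location format and the AKTE example
come from the description above.

```python
from collections import defaultdict


def build_concordance(corpus, pages):
    """Map each word form to its list of text locations.

    `pages` maps (article, page) to that page's lines of text.
    A location is (corpus, article, page, line number, word position),
    with lines and word positions counted from 1.
    """
    index = defaultdict(list)
    for (article, page), lines in pages.items():
        for line_no, line in enumerate(lines, start=1):
            for position, word in enumerate(line.split(), start=1):
                index[word.upper()].append(
                    (corpus, article, page, line_no, position)
                )
    # Alphabetical order of forms, as in the printed list.
    return dict(sorted(index.items()))


if __name__ == "__main__":
    # Hypothetical page: eleven empty lines, then AKTE as word 2 of line 12.
    pages = {("R", "A"): [""] * 11 + ["V AKTE"]}
    for form, locations in build_concordance("B", pages).items():
        print(form, len(locations), locations)
```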
For each word we thereupon listed twenty or more occurrences
in context, along with the professional English translation of the
word and its context.

As an example of our procedure we will use the Russian
word отношение, which translates into English as relation, regard,
etc.


The following page shows a work sheet
of one of our researchers, listing text locations,
Russian contexts and English translations.
The translations for the word under
consideration are underlined.
[Facsimile of a handwritten work sheet. Not legible in this copy.]
Since translators use a proliferation of translations, many
of which overlap or are synonymous (see illustration on front
cover), it was necessary to determine the minimal number of
translations needed to render the word in question into good
English.


At the top of the following page we
give all the translations of отношение
used by the professional translators, and
at the bottom the smaller number of translations
which we found necessary for the
occurrences under consideration. Instances
in which the professional translator recast
the entire sentence have not been included
in this table.
Translations Used by Professional Translators

    ratio
    relationship
    behavior
    ration
    effects
    relation
    respect
    proportion
    point of view
    as regards
    connection
    0
    response

Necessary Translations

    respect
    proportion
    as regards
    relation
    0 ("ly" added to translation of preceding modifier)
    ratio
    response


Once the preceding two steps had been completed, these
contexts were "mapped" in such a way that any consistencies might
be quickly brought to the attention of the researcher.


On the opposite page we give the
syntactic "mapping" of the word отношение.
The numbers down the left hand margin
indicate the sentence in which the word
occurred. The letters down the right hand
margin refer to the translation which the
professional translator used in each case.
The first column on the left gives the
Russian words of which this word was an
object. The second column lists the prepositions
of which this word was an object.
The third column gives the modifiers which
preceded the word in text. The fourth
column lists genitive nominals which followed
the word, and the next to last column shows
the prepositions which were governed by the
word. The final column gives the objects of
the prepositions which were governed by
отношение.
Next it was necessary to discover which words in the environment
of the word under consideration could be considered determiners
for the various translations of that word.


A particular set of determiners is
generally applicable within a particular
syntactic context. We list here the determiners
which we found in text for отношение,
along with their translations and relevant
syntactic contexts.
(1)  в отношении

         аппарата (apparatus)
         сосудов (vessels)
         способности (capacity)
         синтеза (synthesis)              >  as regards
         богатства (wealth)
         проникновения (penetration)

         1 г. (1 gram)                    >  in the proportion of

(2)  отношени-

         величины (magnitude)     к   первой (first)
         количества (quantity)    к   рациону (ration)
         числа (number)           к   числу (number)           >  ratio
         количества (quantity)    к   количеству (quantity)

         [illegible]              к   индукции (induction)
         лягушек (frogs)          к   температуре (temperature)

(3)  по отношению к               >  with respect to

(4)  в этом отношении             >  in this respect

(5)  в морфологическом отношении  >  morphologically
Based on the lists of determiners, we attempted to formulate
general rules for the correct translation of the words which we had
studied.

The group which has as its translation "ratio" displayed a
certain semantic homogeneity. The one-member group 1 г.
(1 gram) suggested a group represented conceptually by "amount."
Other groups displayed no conceptual homogeneity and were
suspected of being open-ended or residue classes.


On the opposite page we show the
initial rules which were formulated for the
translation of отношение on the basis of
our original corpus.


(1)  в отношении (genitive nominal block) e.g.

     A.  When genitive nominal is specified amount (e.g., 1 г.),
         translate as "in the proportion of"

     B.  Otherwise, translate as "as regards"

(2)  отношени- genitive nominal block к dative nominal block

     A.  When genitive nominal is a unit of measurement, "ratio"

     B.  Otherwise, either "relation" or "ratio", except when
         rule 3A applies

(3)  отношени- genitive nominal block

     A.  When genitive nominal is animate, "response"

     B.  Otherwise, "relation"

(4)  по отношению к  "with respect to" (idiom)

(5)  в этом отношении  "in this respect" (idiom)
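
The five rules above lend themselves to a small decision procedure,
sketched here in Python. It is an illustration under stated
assumptions, not the project's program: the function and parameter
names are invented, the syntactic context is assumed to have been
identified already by the parser, and the three word sets are
hypothetical mini-lexicons seeded from the determiners found in text.
Where rule 2B leaves the choice open, both candidates are returned.

```python
from typing import List, Optional

# Hypothetical mini-lexicons (genitive forms), for illustration only.
AMOUNTS = {"1 г."}
MEASURES = {"величины", "количества", "числа"}
ANIMATE = {"лягушек"}


def translate_otnoshenie(prep: Optional[str] = None,
                         modifier: Optional[str] = None,
                         genitive: Optional[str] = None,
                         governs_k: bool = False) -> List[str]:
    """Return candidate translations of отношение for one context.

    prep      -- preposition governing the word ("в", "по") or None
    modifier  -- modifier preceding the word, or None
    genitive  -- genitive nominal following the word, or None
    governs_k -- True if the word governs "к" plus a dative block
    """
    if prep == "по" and governs_k:
        return ["with respect to"]                      # rule 4
    if prep == "в" and modifier == "этом":
        return ["in this respect"]                      # rule 5
    if prep == "в" and genitive is not None:            # rule 1
        if genitive in AMOUNTS:
            return ["in the proportion of"]             # 1A
        return ["as regards"]                           # 1B
    if genitive is not None and governs_k:              # rule 2
        if genitive in MEASURES:
            return ["ratio"]                            # 2A
        if genitive in ANIMATE:
            return ["response"]                         # exception: rule 3A
        return ["relation", "ratio"]                    # 2B, left open
    if genitive is not None:                            # rule 3
        return ["response"] if genitive in ANIMATE else ["relation"]
    return []                                           # no rule applies


if __name__ == "__main__":
    print(translate_otnoshenie(prep="в", genitive="синтеза"))
    print(translate_otnoshenie(genitive="количества", governs_k=True))
    print(translate_otnoshenie(genitive="лягушек", governs_k=True))
```

Keeping the lexicons as plain sets makes it easy to add determiners
as fresh text is examined.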
It was now necessary to apply the rules which we had
developed to fresh text and to amend, augment or discard our
previous rules.

In the case of отношение, for example, an examination of
additional text indicated that our rules were in the main correct.
More words were added to our list of determiners for the translation
"ratio." They were длительности (length), высоты (height),
and длине (also length). Their semantic relationship to the old
set turned out to be as predicted. The expression в этом отношении
(in this respect), which we had originally considered an idiom,
proved to be but one of a set, which in the context в (modifier)
отношении (-иях) determines the translation as "respect." The
determining modifiers found were других (other) and всех (all).
Here again, we note a certain semantic homogeneity, and we could
extend this list on a speculative basis and, with the help of a
dictionary, actually find new examples (e.g., многих [many]).

The research in this area was a success. It has revealed
some of the most interesting facets of the Russian semantic system
to date. We feel that we have brought MT to the edge of even more
important discoveries in this area.


CLAUSE  BOUNDARY  DETERMINATION 


During the contract period a number of routines have been
added to the basic syntax program to improve the quality of the
translation. The most significant of these is a syntax pass which
examines commas and conjunctions whose function is ambiguous
(that is, conjunctions which may connect words, phrases, or
clauses: for example, "and"), in order to determine whether they
function as clause separators. If they do, they are marked as
boundaries for subsequent syntactic and semantic searches. Two
types of clause boundaries are recognized: (1) those separating
independent clauses from each other (these can be commas, or
conjunctions, or both), and (2) those which separate dependent
clauses from the clause on which they depend (commas only):

Ex.: Рибонуклеопротеиды могут иногда находиться
только в клетках глубокой зоны, а элементы средней и
наружной зоны полностью лишены их. = ribonucleoproteins may
sometimes be found only in the cells of the deep zone, while elements
of the middle and outer zones are completely devoid of them.

Ex.: Рибонуклеопротеиды, которые отсутствуют от
элементов средней и наружной зон, могут находиться в
клетках глубокой зоны. = ribonucleoproteins, which are not
present in elements of the middle and outer zones, may be found
in cells of the deep zone.

The reason for this distinction is that in the first case,
syntactic and semantic searches should be discontinued at the clause
boundary, whereas in the second, these searches should be
interrupted at the left boundary of the dependent clause (which is not
searched for syntactic and semantic codes relating to the main
clause), and should be resumed at the right boundary and continued
to the end of the sentence. This makes it possible to identify
syntactic and semantic elements of a single clause even when they are
separated by long stretches of dependent or parenthetical material.

The clause boundary algorithm first searches the sentence
for a comma or a conjunction. If the comma or conjunction is
between two predicates, the immediate environment is searched
to determine whether a definite decision can be made that the item
is a clause boundary. (One example of this would be a comma
preceded by an unambiguous conjunction, such as что.) If no
syntactic elements which definitely establish the item as a clause
boundary are present, the segment between the two predicates is
searched for additional commas or ambiguous conjunctions. If
none are present, the current item is marked as the clause boundary.
In cases where an additional potential clause boundary is present,
the environments of both items are searched to determine whether
a definite decision can be made that one or both items are not clause
boundaries (for example, commas or ambiguous conjunctions
connecting two modifiers). If no definite decision is possible at this
point, a further search for an additional comma or conjunction is
made, and the same inspection of the environment is carried out
on the new item. If no decision can yet be reached, a fourth
potential clause boundary is searched for and checked out. Where
no definite conclusion can be reached at this point, attempts to
resolve the boundary of that particular clause are abandoned and
processing of the next clause is begun.

After the initial search of the clause boundary algorithm, if
the current comma or conjunction is not between two predicates
but between a predicate and the beginning of the sentence, the
segment between the comma or conjunction and the predicate is inspected
to determine whether it contains a relative pronoun. If a relative
pronoun is found, the current item is marked as the preceding clause
boundary and a search for the following clause boundary is then
initiated.
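
The search between two predicates can be sketched roughly as follows.
This is a simplified illustration in Python and makes several
assumptions that are not in the text: words arrive already tagged
(the tag names PRED, COMMA, AMBIG_CONJ, UNAMBIG_CONJ, MOD and NOUN
are invented), a comma immediately followed by a conjunction is
treated as one item located at the comma, and the environment tests
are reduced to the two examples given above. The relative-pronoun
case at the start of a sentence is not covered.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (word, tag)
MAX_CANDIDATES = 4       # the text gives up after a fourth candidate


def candidate_positions(tokens: List[Token], left: int, right: int) -> List[int]:
    """Commas and ambiguous conjunctions strictly between two predicates."""
    found = []
    for i in range(left + 1, right):
        tag = tokens[i][1]
        if tag == "COMMA":
            found.append(i)
        elif tag == "AMBIG_CONJ" and tokens[i - 1][1] != "COMMA":
            found.append(i)  # a conjunction right after a comma is not counted twice
    return found[:MAX_CANDIDATES]


def is_definite_boundary(tokens: List[Token], i: int) -> bool:
    """True if the item stands next to an unambiguous conjunction."""
    neighbours = [tokens[j][1] for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
    return "UNAMBIG_CONJ" in neighbours


def is_definite_non_boundary(tokens: List[Token], i: int) -> bool:
    """True if the item merely connects two modifiers."""
    return (0 < i < len(tokens) - 1
            and tokens[i - 1][1] == "MOD" and tokens[i + 1][1] == "MOD")


def find_boundary(tokens: List[Token], left: int, right: int) -> Optional[int]:
    """Index of the clause boundary between two predicates, or None."""
    candidates = candidate_positions(tokens, left, right)
    for i in candidates:
        if is_definite_boundary(tokens, i):
            return i
    if len(candidates) == 1:
        return candidates[0]  # only one possible separator
    remaining = [i for i in candidates if not is_definite_non_boundary(tokens, i)]
    return remaining[0] if len(remaining) == 1 else None  # None: abandoned


def clause_boundaries(tokens: List[Token]) -> List[Optional[int]]:
    """One decision for each pair of neighbouring predicates."""
    preds = [i for i, (_, tag) in enumerate(tokens) if tag == "PRED"]
    return [find_boundary(tokens, a, b) for a, b in zip(preds, preds[1:])]


if __name__ == "__main__":
    sentence = [("beam", "NOUN"), ("explained", "PRED"), ("professor", "NOUN"),
                (",", "COMMA"), ("a", "AMBIG_CONJ"),
                ("student", "NOUN"), ("ignored", "PRED")]
    print(clause_boundaries(sentence))
```

A result of None for a pair of predicates corresponds to the case in
which the attempt to resolve the boundary is abandoned.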
SUBJECT RECOGNITION


As clause boundaries can now be defined accurately in all
but a few cases, it has been possible to make several improvements
in the quality of the translation by expanding the existing
subject recognition routine to handle objects and by including a
routine to identify a series of subjects. The subject-object
recognition routine operates within the limits of a particular clause as
defined by the clause boundary pass, searching for both subject
and object in the positions where each most frequently occurs,
in that order. The search for a subject is begun in the segment
preceding the predicate and, if unsuccessful, continued in the
segment extending from the predicate to the next clause boundary
or end of sentence; in searching for an object, the order of
segments searched is reversed. When a subject or object is
encountered, it is marked as to syntactic function and transferred
to an extended nominal blocking pass which sets the limits of the
particular subject or object block.

If the predicate is plural, the subject search does not conclude
when one subject is found, but transfers control to a subroutine
which searches for possible additional subjects. The immediate
context of the current subject is first investigated to determine
whether commas or ambiguous conjunctions are present. If such
coordinators are found, a search for an additional subject is made;
if no commas or conjunctions are found, the search is abandoned.
In the case of a singular subject and a plural predicate a "trouble"
signal is stored if no additional subjects are encountered. When
additional subjects are found, they are marked as subjects and
processed by the subject blocking routine.
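
The subject search, including the extra pass for plural predicates,
can be sketched as follows. This is a minimal illustration, not the
routine itself: the names, the tag set (NOM for a nominative nominal,
COORD for a comma or ambiguous conjunction), and the choice to look
only at the item immediately after the current subject are
assumptions. Nominal blocking and the object search are left out.

```python
from typing import List, NamedTuple, Optional, Tuple


class Tok(NamedTuple):
    word: str
    tag: str              # "NOM", "PRED", "COORD" or "OTHER" (assumed tags)
    plural: bool = False


def first_nominative(clause: List[Tok], indices) -> Optional[int]:
    """Index of the first nominative nominal among `indices`, if any."""
    for i in indices:
        if clause[i].tag == "NOM":
            return i
    return None


def find_subjects(clause: List[Tok], pred: int) -> Tuple[List[int], bool]:
    """Return (subject indices, trouble flag) for one clause."""
    # Segment before the predicate first, then the segment after it.
    subject = first_nominative(clause, range(0, pred))
    if subject is None:
        subject = first_nominative(clause, range(pred + 1, len(clause)))
    if subject is None:
        return [], False

    subjects = [subject]
    trouble = False
    if clause[pred].plural:
        current = subject
        # A coordinator right after the current subject triggers a search
        # for one more subject; otherwise the search is abandoned.
        while current + 1 < len(clause) and clause[current + 1].tag == "COORD":
            extra = first_nominative(clause, range(current + 2, len(clause)))
            if extra is None:
                break
            subjects.append(extra)
            current = extra
        # Singular subject, plural predicate, nothing else found.
        if len(subjects) == 1 and not clause[subject].plural:
            trouble = True
    return subjects, trouble


if __name__ == "__main__":
    clause = [Tok("professor", "NOM"), Tok("and", "COORD"),
              Tok("student", "NOM"), Tok("work", "PRED", plural=True)]
    print(find_subjects(clause, pred=3))
```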
TRANSFORMATION PROGRAMS FOR WORD
ORDER ARRANGEMENT IN THE
ENGLISH TRANSLATION


Since the functional load of word order in inflected languages
(such as Russian) is much lower than in non-inflected languages
(such as English), it is clear that an adequate machine translation
program must provide for the resolution of word order discrepancies
between the source and target languages. In our 1961 report
(Machine Translation Studies of Semantic Techniques, AF 30(602)-2036),
we discussed in detail existing programs for three types of
word order rearrangement in the English translation. These three
types were described as subject-object rearrangement, rearrangement
of governing modifier packages, and rearrangement of auxiliaries
and modals within predicate blocks. The first type of
rearrangement is the most complex, since it involves the
identification of more syntactic elements and reshuffling of large blocks
of text. Furthermore, in order to operate successfully within
multiclause sentences, it requires a clean definition of boundaries
between clauses. At the time of that report, large-scale
rearrangements could be effected only on single-clause sentences, as there
was no systematic way of identifying separate clauses. It was
possible to rearrange the English translation of:

в среде движется пучок заряженных частиц
(In the medium moves the beam of charged particles)

to read:

The beam of charged particles moves in the medium;

or to rearrange the English translation of:

пучок заряженных частиц объяснил профессор
(The beam of charged particles explained the professor)

to read:

The professor explained the beam of charged particles.
If an additional clause were added, the sentence would read:


пучок заряженных частиц объяснил профессор, а студент
не обращал внимание

The beam of charged particles explained the professor, but
the student wasn't paying attention.

In that case the first clause could not be handled by the word order
transformation program, because there was no program for determining
whether the ambiguous conjunction "а" connected words, phrases,
or clauses.

Since the decisions of the clause boundary routine discussed
above are now one of the information inputs to the rearrangement
program, two separate searches for possible rearrangement
requirements are made on the sample sentence: one on the first clause
(the segment between the sentence beginning and the conjunction);
the next on the second clause (from the conjunction to the period).
If a requirement for rearrangement is revealed by search of the
first segment, the English translation of that segment is rearranged
as specified, independently of the second clause, which is not checked
for a rearrangement requirement until the rearrangement requirement
of the first clause has been satisfied. Thus, the resulting
English translation of the sample sentence is:

The professor explained the beam of charged particles,
but the student wasn't paying attention.
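
Clause-by-clause rearrangement of the sample sentence can be
sketched as below. This is an illustration only: the role labels,
the fixed target order (conjunction, subject, predicate, object,
adverbial), and the function names are assumptions, and punctuation
is ignored. The boundary position is taken as given, as if supplied
by the clause boundary routine.

```python
from typing import List, Tuple

Block = Tuple[str, str]  # (English text of the block, syntactic role)

# Assumed English target order for the roles within one clause.
TARGET_ORDER = {"CONJ": 0, "SUBJ": 1, "PRED": 2, "OBJ": 3, "ADV": 4}


def rearrange_clause(blocks: List[Block]) -> List[Block]:
    """Reorder the blocks of a single clause into the target order."""
    # sorted() is stable, so blocks with the same role keep their order.
    return sorted(blocks, key=lambda block: TARGET_ORDER[block[1]])


def rearrange_sentence(blocks: List[Block], clause_starts: List[int]) -> List[Block]:
    """Rearrange each clause separately.

    `clause_starts` holds the index at which each clause after the
    first one begins.
    """
    result = []
    start = 0
    for end in list(clause_starts) + [len(blocks)]:
        result.extend(rearrange_clause(blocks[start:end]))
        start = end
    return result


if __name__ == "__main__":
    sentence = [("the beam of charged particles", "OBJ"),
                ("explained", "PRED"),
                ("the professor", "SUBJ"),
                ("but", "CONJ"),
                ("the student", "SUBJ"),
                ("wasn't paying", "PRED"),
                ("attention", "OBJ")]
    output = rearrange_sentence(sentence, clause_starts=[3])
    print(" ".join(text for text, _ in output))
```

Calling `rearrange_sentence` with an empty list of clause starts
treats the whole sentence as one clause and mixes the blocks of the
two clauses, which is the difficulty that the boundary information
removes.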
SYNTHESIS


Several steps have been taken in this area to achieve something
significantly beyond a modified word-by-word translation by
concentrating on the translation algorithm (and translating
transformation). In the past, with a very few exceptions (i.e.,
idioms), Russian constructions have been translated into corresponding
English constructions. Because English and Russian display certain
syntactic similarities, the result of this policy, though often
awkward, is usually comprehensible. In conjunction with the semantic
research previously described, exploratory work was undertaken in
this area. As a result, rules for the handling of Russian
constructions containing можно and следует were devised and applied
to new text. The rules proved very successful and serve to render
what were previously tricky and awkward constructions into idiomatic
English. We felt this was ample justification for programming these
rules. This activity is so recent that it is not reflected in the
sample output in the appendix of this report.


MACHINE  TRANSLATION  OF 
STYLISTIC  DIFFERENCES 


In connection with our preparations for the machine translation
of fiction, we have given considerable thought to the problem
of the automatic detection and correct rendition of stylistic
differences. At the present stage of our research we are able to specify
two areas in which we shall ultimately be able to do this. One is
the stylistically correct use of vocabulary, which is part of the
overall problem of multiple meaning to which the research under the
present contract is addressed. This aspect of the stylistic problem
is not essentially different from other problems of multiple meaning
resolution. The second aspect of style is related to the stylistic
function of syntactic elements. We have given consideration to the
well-known fact that the position of subjects and objects with respect
to the predicate has a definite function in the Russian sentence:
there is a tendency for the initial position in the sentence,
whether it is filled by subject or object, to indicate old information,
while there is a tendency for the final position, whether it is filled
by subject or object, to indicate new information. This stylistic
function of position within the sentence can be preserved in the
English translation and can contribute to producing machine
translation which is not only intelligible but also stylistically more
closely comparable to the original. At this stage we are exploring two
possibilities for the rendition of this stylistic value of position into
English.

One possibility is the use of passive translations for Russian
active sentences whenever in the Russian original the subject is in
the sentence-final position, indicating that it constitutes new
information. In this case, the Russian subject would be rendered by the
English agent object, which likewise would be in the final position
and hence equally indicative of new information. Our assumption
here would be that the passive English sentence is the stylistic
equivalent of the Russian active sentence with inverted order.
As an example, let us take the following Russian sentence:
"Какой огромный путь в своем развитии прошла наша страна
за сорок два года после свержения власти капиталистов и
помещиков и установления советской рабоче-крестьянской
власти!" It stems from the lead article "Беспримерный научный
подвиг" (Unparalleled Scientific Advance) in Pravda of 27 October
1959, dealing with the Soviet moon shot and the pictures taken of
the far side of the moon. In the light of our interpretation of style,
we may assume that the object of this sentence, "Какой огромный
путь..." (What enormous path...), is old information, in view of
the preceding discussion of the great achievements reported on,
and that the subject "наша страна" (our country) and particularly
the final predicative complement "за сорок два года..." (after
forty-two years...) are presented as new information. Clearly,
then, the most idiomatic translation, taking the stylistic function
of the sentence portions in the Russian original into account and
rendering them equivalently in the English, would involve a change
from the active to the passive voice and a retention of the order
in which the semantic components of the sentence appear in Russian.
Thus, instead of the rearrangement which our present program aims
at, namely, "Our country in forty-two years after the overthrow of
the rule of the capitalists and landowners and the establishment of
the workers' and peasants' rule passed what enormous path in its
development," we shall aim at the following translation, which
preserves the original order by changing the voice of the English verb
from active to passive: "What enormous path of development was
passed by our country in forty-two years after the overthrow of the
rule of capitalists and landowners and the establishment of the
workers' and peasants' rule!"
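
The order-preserving passive rendering discussed above can be sketched as follows. The sketch assumes that the analysis has already labelled the sentence components; the role names, the `PAST_PARTICIPLE` table, and the fixed singular auxiliary "was" are illustrative assumptions and do not come from the RW program.

```python
# Hypothetical lookup from an active past form to its passive participle.
PAST_PARTICIPLE = {"passed": "passed", "explained": "explained"}


def render_preserving_order(components):
    """components: (role, english) pairs in the order of the Russian original."""
    roles = [role for role, _ in components]
    needed = ("object", "predicate", "subject")
    inverted = (
        all(role in roles for role in needed)
        and roles.index("object") < roles.index("predicate") < roles.index("subject")
    )
    if not inverted:
        # Nothing to preserve by a voice change: emit the components as they stand.
        return " ".join(text for _, text in components)
    out = []
    for role, text in components:
        if role == "predicate":
            # Singular agreement is assumed here for simplicity.
            out.append("was " + PAST_PARTICIPLE.get(text, text))
        elif role == "subject":
            out.append("by " + text)  # Russian subject becomes the English agent
        else:
            out.append(text)
    return " ".join(out)
```

The components keep their Russian order throughout; only the verb form and the agent marker change, so the sentence-final element still carries the new information.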

The second possibility is to consider the function of the
English indefinite article, or absence of the article in the plural, as
an indication of new information. In this case we would give
consideration to the possibility of using the indefinite article or no
article when the corresponding Russian sentence element occupies
the final position. An example would be the sentence "Каждое
из этих достижений — беспримерный научный подвиг!" taken
from the same document. If we translate the final element of this
sentence, the predicative complement "беспримерный научный
подвиг," using the indefinite article in English, we shall come
closest to an idiomatic translation: "Each of these achievements
is an unparalleled scientific advance."
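
A minimal sketch of this second possibility is given below. It is an assumption-laden illustration: the function name, the default of "the" for non-final elements, and the spelling-based choice between "a" and "an" are not taken from the report.

```python
def article_for(noun_phrase, is_sentence_final, is_plural=False):
    """Pick an English article from the position of the Russian element."""
    if not is_sentence_final:
        return "the"  # assumption: elements that are not final default to "the"
    if is_plural:
        return ""     # new information in the plural: no article
    # Rough spelling heuristic for a/an; real English needs pronunciation.
    return "an" if noun_phrase[0].lower() in "aeiou" else "a"
```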

This  research  is  as  yet  in  the  exploratory  stage. 
SEMANTIC RESEARCH USING
THE USHAKOV DICTIONARY


The utilization of examples of usage given in the Ushakov
Dictionary, which was proposed for the present research, has
been attempted. Unfortunately, in the tests which we conducted,
using about 100 entries, the majority of the examples given in the
Ushakov do not allow multiple meaning due to different subject
matter field. These differences are not amenable to the
determiner-determinee treatment which we are at present in a position to
implement.
TEST SENTENCES FOR SYNTAX
FLOW  CHARTS 


In  order  to  facilitate  manual  and  machine  checkouts  of  the 
syntax  routines,  a  large  number  of  test  sentences  (189)  were 
created  to  serve  as  tools  in  this  process. 

The  aim  of  these  checkouts  is  different  from  that  of  the 
processing  of  live  text  for  testing  purposes.  The  processing  of 
live  text  tests  whether  or  not  our  syntax  rules  cover  all  significant 
conditions  of  Russian  sentence  structure.  The  purpose  of  our  test 
sentences,  on  the  other  hand,  is  to  test,  not  the  adequacy  of  our 
rules,  but  whether  or  not  the  program  actually  carries  out  these 
rules  as  intended. 

A test sentence was created to check out each significant
branch of every flow chart. This means that for every question
asked on a flow chart, at least two sentences were created, one
corresponding to the yes answer, one to the no answer. We could
thus assume that whenever a failure occurred with a particular
test sentence, there was a great likelihood that this failure would
be due to the branch on the flow chart which this sentence was
intended to test.
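
The requirement of one yes sentence and one no sentence per flow chart question amounts to a branch coverage check on the test inventory. A minimal sketch, assuming a hypothetical record format in which each test sentence lists the (question, answer) branches it is intended to exercise:

```python
def uncovered_branches(questions, test_sentences):
    """Return the (question, answer) branches that no test sentence is intended to exercise."""
    required = {(q, answer) for q in questions for answer in ("yes", "no")}
    covered = {branch for s in test_sentences for branch in s["branches"]}
    return sorted(required - covered)
```

An empty result means every question on the chart has both of its exits covered by at least one sentence.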

Since the test sentences were designed for purposes of the
syntax, they were as far as possible deliberately restricted to
the syntactic conditions which they are intended to test. In
particular, we wanted to avoid complicating the tests by introducing more
than the minimum necessary number of dictionary entries.
Consequently, to display the syntactic difference to maximum advantage,
the same dictionary words were used over and over again in the test
sentences, bringing about a somewhat unnatural impression. Since
some of the test sentences were introduced to test flow chart exits
marked "notice of error," these particular sentences were designed
to be grammatically incorrect. An example is sentence No. 2 for
testing the relative pass.
We wish to illustrate the creation of the test sentences by
giving some examples of sentences pertaining to Homograph
Resolution Pass HR 1. (Homographs of the type "трудно," "согласно,"
called predicative-adverb-preposition homographs.) Test Sentence
No. 1 illustrates the yes answer to the question "Is the
predicative-adverb-preposition homograph immediately preceded by a
preposition and immediately followed by a modifier?" Note that this
sentence contains a homograph (прямо) in the required position
between a preposition (от) and a modifier (рекуррентных).

This test sentence is correctly translated if the homograph is
rendered by an English adverb.

Test Sentence No. 14, on the other hand, illustrates a yes
answer to the question "Is predicative-adverb-preposition
homograph immediately followed by a governed block?" and a no answer
to the question "Is predicative-adverb-preposition homograph
immediately preceded by a comma?" This test sentence is
correctly translated if the homograph is rendered by an English
predicate.

Test Sentence No. 15 differs from Test Sentence No. 14 by
having a yes answer to the question "Is predicative-adverb-preposition
homograph immediately preceded by a comma?" which
is in turn followed by a yes answer to the question "Is governed
block immediately followed by a comma?" This sentence is
correctly translated if the homograph is rendered by a preposition.
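
The three branches illustrated by Test Sentences No. 1, 14, and 15 can be written as a small decision function. This is a sketch only: the order in which the real HR 1 flow chart asks its questions is an assumption, the parameter names are hypothetical, and every branch not described in the text returns `None` rather than a guess.

```python
def resolve_hr1(*, prev_is_preposition, next_is_modifier,
                next_is_governed_block, prev_is_comma, block_followed_by_comma):
    """Return the English word class for a predicative-adverb-preposition homograph."""
    if prev_is_preposition and next_is_modifier:
        return "adverb"           # branch tested by Test Sentence No. 1
    if next_is_governed_block:
        if not prev_is_comma:
            return "predicate"    # branch tested by Test Sentence No. 14
        if block_followed_by_comma:
            return "preposition"  # branch tested by Test Sentence No. 15
    return None                   # branches not described in this section
```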

Each time a change is made in the program, the test
sentences are run in order to ascertain the effects of that change.
The resulting vertical listing is then available for analysis. This
is, however, not enough. It does not take into account the
possibility that a change which has produced an improvement in one
place in the program may inadvertently result in a change for the
worse elsewhere.

Consequently, each time a change is made in the program,
the test sentences are run, and in addition to printing them out
directly, the information tape of the current run is automatically
compared to the information tape of the latest previous run of the
same sentences. This comparison is conducted bit by bit. If any
discrepancies are detected, a vertical listing of the full sentence
is selected for output, and each word which displayed a discrepancy
is flagged for the benefit of the analyst.

The inventory of test sentences can be expanded when
necessary. The flow chart of the test sentence comparer on the
following page indicates the provisions that have been made for
such additions.
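
The comparison of the current run against the latest previous run can be sketched as below. The data layout is an assumption made for illustration: each run is modelled as a dictionary from a sentence identifier to a list of per-word records held as `bytes`, standing in for the information tape, and sentences added since the previous run are simply skipped.

```python
def compare_runs(previous, current):
    """Return {sentence_id: [indices of words whose records differ]} for changed sentences."""
    report = {}
    for sentence_id, new_words in current.items():
        old_words = previous.get(sentence_id)
        if old_words is None:
            continue  # newly added test sentence: nothing to compare against
        longest = max(len(new_words), len(old_words))
        flagged = [
            i for i in range(longest)
            if i >= len(new_words) or i >= len(old_words) or new_words[i] != old_words[i]
        ]
        if flagged:
            report[sentence_id] = flagged  # select this sentence for a full listing
    return report
```

Comparing `bytes` records for equality is the stand-in here for the bit-by-bit comparison; only sentences with at least one flagged word would be printed in full.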
Test Sentences for Symbol Pass


1  . 
2. 

3. 

4. 

5. 

6. 
7. 


8. 


9. 
10. 
1 1 . 
12. 

13. 

14. 


cr'tacTimbi  upohhk&dt  nepea  hapo . 
BaxHbie  X  npoHHKBDT  vepea  hapo . 


BaxHbie  MacTimu  npoHKanT  uepea  21. 

Heaban  a h&t t ,  noTony  hto  21  npoHHKaeT  nepea  j«po. 
Mac thu a  2—  npoHHKaeT  nepea  hapo. 

IIocToaHHaji  npoHHKaeT  ^epea  hapo. 

npoHHKaeT  qepea  hapo . 

IlaAeHHe  HaxHiia  h  21  npoHHicasT  qepea  hapo . 
Coa^auBtaa  21  MacTHna  npoHHKaeT  qepe3  hapo . 
CoaxaHHoe  21  npoHHKaeT  qepea  hapo . 

MacTHna  npoH3BOAHT  21. 

IIpoh3boah^,  aacTHna  npoHHKaeT  qepe3  hapo. 
MacTHna  iioxeT  npoHBBOAHTb 21 . 

MacTHqu  h  ^npoHHKaBT  qepea  hapo. 


Test Sentences for Homograph Resolution Pass: type "трудно," "согласно"

1.  Наши результаты происходят от прямо рекуррентных методов.

2.  Явление произошло согласно нашим результатам.

3.  Явление произошло между прочим согласно нашим результатам.

4.  Производя результаты согласно нашему предположению, опыты
    произошли без трудностей.

5.  Явление произошло ясно.

6.  Явление произошло между прочим ясно.

7.  Производя результаты согласно, опыты произошли без трудностей.

8.  Явление было согласно с нашим предположением.

9.  Явление согласно было с нашим предположением.

10. Явление будет согласно с нашим предположением.

11. Трудно предполагать такие результаты.

12. Трудно нам предполагать такие результаты.

13. Предполагать нам трудно такие результаты.

14. Такое явление согласно нашим результатам.
15. Такие опыты, согласно нашему предположению, произошли без
    трудностей.

16.  PaOoTa,  npoacxoAxaax  aoboxbbo  Tpyx«o ,  ooaepaaeTOit  oeroAxa. 

17.  Baxhim  xoxo  xexsBecTxue  peayAbTipru  npoaoxojwT  Oas  TpyAXOcrei. 

18.  Scxo,  mto  Heats*. 

19.  Homo,  am  TaxoM  naan  Heats*. 

20.  Taxaa  onuru,  *cho,  npoxoxoAXT  Oaa  TpyAxocTatt. 

21.  Taxoe  abmh««  *cho  ,  a  xax  onur  npoasotUeT  Oaa  TpyAxootait. 

22.  Taicoa  xaieiut  OaspasaxHxo  t  pyx  ho  . 

23.  flciio  HJaecTHult  pasyatTax  npoxoxoAXT  6ea  TpyAMoexaft. 

24.  CoaepaeHCTBOBaHae  xaxxx  pesyaxaxoB  onext  t pyx ho . 

Test Sentences for Homograph Resolution Pass: его, ее, их

1 .  PaOoxa  npoHsoxaa  nocae  aro  ohbhb  aasaux  ycxAxit. 

2.  npexnoxoxBHHa  xx  onext  aaxHti. 

3.  CoTpyxHxx  ooxpauaBT  aa. 


Test Sentences for Homograph Resolution Pass: type "постоянная"

1.  nocToxKxa*  TpyxH&a  paOova  xaaBCTpxpyex  uiHb. 

2.  Xxaub  xxxnoTpxpyaT  nooToxxxtix  qxtb  paOor. 

3.  Cxexynxea  oaoao  npxHaxxaxxt  npotpeocopy. 

4.  Coipyxxxx  cxaaaa  oa*Ay»aee. 

3.  Cxaxycxaa  xaxyn  paOoxy  noxxTxe  npxxxMaeTC*. 

6.  nooToxxxax  QpxxxxaaT  paaxua  axaMaxx a. 

7.  Haxx  Aoxxaxu  xbjuodtox  yxaxtaix. 

8.  Haxx  aokjmuu  ganxoaxy  yxaxMix. 

9.  Haxa  paOoia  npoaaxaxa  yqaxtat. 

10.  Haifa  paOota  npxBOAXT  x  yaaxuM. 




1 1 .  Hana  paOoTa  npxBOjxT  x  nocTOXHXott. 

12.  Haxa  pafioTa  bcxojbt  ot  nocToxxxott. 

13.  Haaa  paOota  aaaaeTC a  nocToxHxott. 


Test Sentences for Nominal- and Prepositional-Blocking Pass

1 .  Bonpoc  oTxpfciroro  oxxa  paaaaTOx  oaroAxx. 

2.  Bonpoc  oTxpuroS  paxaTxn  paaaaTOx  oeroAX*. 

3.  Asa  oTxpuThtx  oxaa  bsucoaxtcx  ajecb. 

4.  Abb  oTxpurux  paxaTxx  xaxoxxTOx  axaob. 

5.  OTxpuroe  a  saxpwTo#  oxxa  xaxoAXToa  ajacb. 

6.  Hexbsx  pafioTXTb  8ea  OTxpwroro  oxxa. 

7.  Hexbax  pafioTaTb  6ea  OTxpurux  oxxa  x  pexeTxx. 


Test Sentences for Inserted-Structure Pass

1.  Конечно, частицы проникают через ядро.

2.  Сегодня, частица прошла через ядро.

3.  Частицы, конечно, проникают через ядро.

4.  Частица, сегодня, прошла через ядро.

5.  Вообще говоря, частицы проникают через ядро.

6.  Частицы, вообще говоря, проникают через ядро.

7.  Частицы, как все знают, проникают через ядро.

8.  Важные, таким способом, частицы проникают через ядро.

9.  Важные, по нашему предположению, частицы проникают через ядро.

10. Частицы проникают через, таким способом, ядро.

11. Частицы проникают через, по нашему предположению, ядро.

12. Сегодня, частицы прошли, конечно, через ядро.

13. Конечно, частицы прошли, сегодня, через ядро.

14. Конечно, частицы проникают, вообще говоря, через ядро.

15. Частицы, конечно, проникают через, по нашему предположению,
    ядро.

16.  Mapaa,  no  xaaaxy  npaxnoaoxaxxB,  xjpo  xyxxoxx,  xoxaxxo , 

npoxxxaBT . 
Test Sentences for Governing-Modifier Pass


1 .  Bpansuonnec*  xacxaxH  axyT  Bnepex • 

2.  Kacaaxu,  apaxaeiaecx  soapy r  xxpat  axyT  anepex. 

3.  Kacaaxn  ayxxeoHoa,  apasasaxeea  aoapyr  xxpa,  axyx  Bnepex. 

4.  Bpamasaaeca  aoapyr  xxpa  aacaaxu  axyT  anepex. 

5.  BpamacmuecH  aoapyr  axpa  aacaaxu  HyaaeoHoa  axyT  anepex. 

6.  Bpaansaaeca  aoapyr  axpa  Sbicrpue  aacaaxu  axyT  anepex. 

7.  Bpaaasaaeca  Boapyr  axpa  Sucrpue  aacaaxu  HyaaeoaoB  axyT  Bnepex. 


"aOTOPHtt"  pass — teat  sentences 

1.  Kacaaxa,  aoTopue  axyT  anepex ,  oqeaa  aeaaaa. 

2.  Kaoaaxu  aoTopue  axyT  Bnepex  oqeaa  Beaaaa. 

3.  Kacaaxa,  aoTopue  anepex,  oqeaa  aeaaaa. 


Test Sentences for Main Syntax Pass: Key to Number Code

X.0.0.0.    Predicate
0.X.0.0.    Subject
0.0.X.0.    Object
0.0.0.X.    Eliminated candidates for subject or object


1.0.0.0.    Predicative plural non-past
2.0.0.0.    Predicative followed by бы
3.0.0.0.    Predicative plural past
4.0.0.0.    Predicative singular non-past
5.0.0.0.    Predicative singular past feminine
6.0.0.0.    Predicative singular past masculine
7.0.0.0.    Predicative singular past neuter
8.0.0.0.    Predicative plural followed by быть
9.0.0.0.    Predicative plural followed by быть followed by
            second predicative
10.0.0.0.   Predicative is были and is followed by second predicative
11.0.0.0.   Predicative is будут and is followed by second predicative
12.0.0.0.   Predicative is a comparative
13.0.0.0.   There is no predicative, but если followed by infinitive
14.0.0.0.   There is no predicative, but если бы followed by infinitive
15.0.0.0.   Predicative is followed by infinitive
16.0.0.0.   No predicative, but gerund
17.0.0.0.   No predicative, but gerund followed by infinitive


0.1.0.0.    Unambiguous nominative plural
0.2.0.0.    Genitive singular/nominative-accusative plural ambiguity
            reduced by unambiguous modifier
0.3.0.0.    Genitive singular/nominative-accusative plural ambiguity
            reduced by nominative numeral
0.4.0.0.    Potential nominative singular, any gender
0.5.0.0.    Potential nominative singular, feminine
0.6.0.0.    Potential nominative singular, masculine
0.7.0.0.    Potential nominative singular, neuter
0.8.0.0.    No subject, but predicative can be impersonal
0.9.0.0.    No subject, predicative cannot be impersonal;
            rearrangement required


0.0.1.0.    Predicate governs accusative, nominative/accusative
            potential object
0.0.2.0.    Predicate governs genitive, genitive potential object
0.0.3.0.    Predicate governs accusative plural potential object
0.0.4.0.    Predicate governs accusative, genitive singular/nominative-
            accusative plural potential object, ambiguity resolved by
            numeral
0.0.5.0.    Predicate governing accusative contains negative particle,
            nominative/accusative potential object
0.0.6.0.    Predicate governing accusative contains negative particle,
            genitive potential object
0.0.7.0.    Predicate governs instrumental, potential instrumental
            object
0.0.8.0.    Predicate governs dative, potential dative object
0.0.9.0.    Predicate governs preposition, prepositional block present
            as potential object


0.0.0.1.    Nominative/accusative plural candidate for subject
            eliminated because of preceding preposition
0.0.0.2.    Genitive singular/nominative-accusative plural candidate
            for subject eliminated because of preceding preposition
0.0.0.3.    Genitive singular/nominative-accusative plural candidate
            for subject eliminated because of preceding modifier
0.0.0.4.    Singular candidate for subject of unspecified gender
            eliminated because of preceding prepositions
0.0.0.5.    Feminine singular candidate for subject eliminated because
            of preceding preposition
0.0.0.6.    Masculine singular candidate for subject eliminated because
            of preceding preposition
0.0.0.7.    Neuter singular candidate for subject eliminated because
            of preceding preposition
0.0.0.8.    Nominative/accusative candidate for object eliminated
            because of preceding preposition
0.0.0.9.    Genitive singular/nominative-accusative plural candidate
            for object eliminated because of preceding modifier
0.0.0.10.   Genitive candidate for object eliminated because of
            preceding preposition
0.0.0.11.   Genitive candidate for object eliminated because of
            preceding other nominal
0.0.0.12.   Instrumental candidate for object eliminated because of
            preceding preposition
0.0.0.13.   Instrumental candidate for object eliminated because of
            preceding other nominal
0.0.0.14.   Dative candidate for object eliminated because of preceding
            preposition
0.0.0.15.   Dative candidate for object eliminated because of preceding
            other nominal

Number followed by "m": nominal block includes modifiers
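
A test sentence label such as 1.1.1.2m. can be decoded against this key mechanically. The sketch below is illustrative only: the function and field names are hypothetical, and it assumes that the fields appear in the order predicate, subject, object, eliminated candidate, that trailing fields may be omitted, and that a trailing "m" on a field marks a nominal block that includes modifiers.

```python
FIELDS = ("predicate", "subject", "object", "eliminated")


def decode(code):
    """Split a label such as '1.1.1.2m.' into (number, has_modifiers) pairs per field."""
    parts = [p.strip() for p in code.strip().rstrip(".").split(".")]
    result = {}
    for name, part in zip(FIELDS, parts):
        has_modifiers = part.endswith("m")
        result[name] = (int(part.rstrip("m")), has_modifiers)
    return result
```

For example, `decode("1.9.")` yields predicate type 1 (plural non-past) and subject type 9 (no subject, rearrangement required), with the object and eliminated-candidate fields absent.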


Test Sentences for Main Syntax Pass


1.1.1.      Товарищи увеличивают союз.
1.1m.1m.    Наши товарищи увеличивают социалистический союз.
1.1.1.1.    Через районы товарищи увеличивают союз.
1.1.1.1m.   Через большие районы товарищи увеличивают союз.
1.1.1.2.    Через сопротивления товарищи увеличивают союз.
1.1.1.2m.   Через большие сопротивления товарищи увеличивают союз.
1.1.1.3.    В смысле такого сопротивления товарищи увеличивают союз.
1.1.1.8.    Товарищи увеличивают через район союз.
1.1.1.9.    Товарищи увеличивают в смысле такого сопротивления союз.
1.1.2.      Товарищи достигают усилий.
1.1.2.10.   Товарищи достигают без трудностей усилий.
1.1.2.11.   Товарищи достигают постановкой трудностей усилий.
1.1.3.      Товарищи увеличивают потери.
1.1.3.8.    Товарищи увеличивают через район потери.
1.1.3.9.    Товарищи увеличивают в смысле такого сопротивления потери.
1.1.4.      Товарищи увеличивают две потери.
1.1.5.      Товарищи не увеличивают союз.
1.1.6.      Товарищи не увеличивают союза.
1.1.7.      Товарищи служат техниками.
1.1.7.12.   Товарищи служат с поправкой техниками.
1.1.7.13.   Товарищи служат без управления постоянным техниками.
1.1.8.      Товарищи отвечают сотрудникам.
1.1.8.14.   Товарищи отвечают по составу сотрудникам.
1.1.8.15.   Товарищи отвечают без подтверждения сотрудникам составу.
1.1.9.      Товарищи отвечают на такой вопрос.
1.2.1.      Такие сопротивления увеличивают союз.
1.2.1.1.    Через районы такие сопротивления увеличивают союз.
1.2.1.2.    Через сопротивления такие партии увеличивают союз.
1.2.1.3.    В смысле такого сопротивления такие партии увеличивают союз.
1.3.1.      Два сопротивления увеличивают союз.
1.3.1.1.    Через районы два сопротивления увеличивают союз.
1.3.1.2.    Через сопротивления две партии увеличивают союз.
1.3.1.3.    В смысле такого сопротивления две партии увеличивают союз.
1.9.        Союз увеличивают товарищи.
2.1.1.      Товарищи сравнивали бы метод.
2.1m.1m.    Наши товарищи сравнивали бы научный метод.
2.1.1.1.    Через примеры товарищи сравнивали бы метод.
2.1.1.2.    Через разложения товарищи сравнивали бы метод.
2.1.1.3.    В смысле такого разложения товарищи сравнивали бы метод.
2.2.1.      Такие сопротивления провели бы метод.
2.2.1.1.    Через примеры такие сопротивления провели бы метод.
2.2.1.2.    Через сопротивления такие примеры провели бы метод.
2.2.1.3.    В смысле такого сопротивления такие примеры провели бы метод.
2.3.1.      Два сопротивления провели бы метод.
2.3.1.1.    Через примеры два сопротивления провели бы метод.
2.3.1.2.    Через сопротивления две партии провели бы метод.
2.3.1.3.    В смысле такого сопротивления две партии провели бы метод.
2.9.        Метод провели бы товарищи.
3.0.1.      Они сравнивали метод.
4.4.1.      Прибор показывает метод.
4.4.1.4.    Через пример мера показывает метод.
4.4.1.4m.   Через такой пример мера показывает метод.
4.8.        Ему удастся.
4.9.        Пользу показывает метод.
5.5.1.      Мера показала метод.
5.5.1.5.    Через медь мера показала метод.
5.5.1.5m.   Через простую медь мера показала метод.
5.9.        Пользу показала мера.
6.6.1.      Союз показал метод.
6.6.1.6.    Через пример союз показал метод.
6.6.1.6m.   Через такой пример союз показал метод.
6.9.        Пользу показал метод.
7.7.1.      Сравнение показало метод.
7.7.1.7.    Через понятие сравнение показало метод.
7.7.1.7m.   Через такое понятие сравнение показало метод.
7.8.        Ему удалось.
7.9.        Пользу показало сравнение.
8.1.        Товарищи могли быть.
8.9.        Могли быть товарищи.
9.1.        Товарищи могли быть показаны.
9.9.        Могли быть показаны товарищи.
10.1.       Товарищи были показаны.
10.9.       Были показаны товарищи.
11.1.       Товарищи будут показаны.
11.9.       Будут показаны товарищи.
12.1.2.     Товарищи быстрее союза.
12.4.2.     Мера быстрее метода.
12.9.       Союза быстрее товарищи.
13.0.1.     Если увеличивать союз.
14.0.1.     Если бы увеличивать союз.
15.1.1.     Товарищи могут увеличивать союз.
15.8.1.     Ему удастся увеличивать союз.
15.9.       Союз могут увеличивать товарищи.
16.0.1.     Создавая союз.

16.0.1 . 

Saxasaacb  coaxaBaTb  cobs . 

Test Sentences for Cleanup Pass CP-1

1.  Важные товарищи идут.

2.  Очень важные товарищи идут.

3.  Не товарищи идут.

4.  Важные товарищи наших сотрудников идут.

5.  Очень важные товарищи наших сотрудников идут.

6.  Не товарищи наших сотрудников идут.

7.  Не важные товарищи идут.

8.  Не важные товарищи наших сотрудников идут.

9.  Товарищи важных профессоров наших сотрудников идут.

Test Sentences for Cleanup Pass CP-2

1.  Нас видеть нельзя.

2.  Простую медь видеть нельзя.

3.  Простую медь такого рода видеть нельзя.

4.  Такого рода простую медь видеть нельзя.

5.  Очень простую медь видеть нельзя.

6.  Пользу видеть нельзя.




HARDWARE  IMPLICATIONS  OF  THE  RESEARCH 


Machine translation research at the RW Division has been
implemented on general purpose computers. We recognize such
hardware requirements as print readers, multi-font printers, and
composing machines.

We have recognized the need for larger memories. This
recommendation, however, has been the general observation: the
larger the memory, the easier the solution.

In the area of dictionary search, we have been able to program
our way around the fact that the high-speed memories of the general
purpose computers that we have been using are too small to contain
our lexicon.

Both programming foresight and cleverness were required for
the general solution of our ordinary dictionary lookup. However, if
one wishes that a far larger memory had been available, it becomes
possible to ask what new attributes other than larger size are
desired of this memory.

What strikes one as the flow charts are examined is that the
need exists for a totally different type of computer. Such a computer
would have an associative memory.

The reasons which first appear for an associative memory are
those based upon the fact that the data are non-numeric. Such
instructions as:

find the word in the dictionary
or

find the prefix of the word
or

compare the end of the word with a possible list of suffixes

immediately present themselves. Their uses are real; our experience
in machine translation, however, has led us to the point where those
portions of the program which require the ability to recognize
alphanumerical configurations require only a small portion of the
total running time.
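
The third instruction, comparing the end of a word with a list of possible suffixes, is shown below as it would be written for a conventional general purpose machine. The function name and the longest-match policy are assumptions made for illustration; the report does not describe the actual lookup routine.

```python
def match_suffix(word, suffixes):
    """Return (stem, suffix) for the longest listed suffix that ends the word."""
    for suffix in sorted(suffixes, key=len, reverse=True):
        # Require a non-empty stem so a suffix never swallows the whole word.
        if suffix and len(word) > len(suffix) and word.endswith(suffix):
            return word[:-len(suffix)], suffix
    return word, ""
```

On a sequential machine each candidate suffix is tried in turn; an associative memory would in principle test all candidates against the word ending at once.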




Attention is directed to the figures on the following pages.
These charts are taken from the 1960 RW machine translation report.
They illustrate a small portion of the syntactic program.
Furthermore, these charts represent the thinking of a linguist and had not
as yet been seriously modified by a programmer's restatement based
on efficient usage of a traditional general purpose computer.

Though there is an occasional reference to actual word forms
in these figures, most references are to grammatical attributes which
would have been found only after a successful dictionary lookup.

In these charts, a large number of the boxes begin with such
words as "search" or "hunt," or with phrases such as "is nominal
immediately preceded by preposition skipping over modifiers and
adverbs." Here we have evidence of the need for a search type memory
past the stage where the problem has consumed alphanumeric data.

Among the important aspects revealed by the flow charts is the
need for a search to be made within a region whose boundaries are
defined syntactically. This searching within a non-graphically
defined area is one of the dominant features of language data processing;
this is made evident as we examine different portions of our bilingual
programs.
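
The flow chart question quoted above, "is nominal immediately preceded by preposition skipping over modifiers and adverbs," is a search over grammar codes inside a syntactically bounded region. The sketch below shows one way to express it on a conventional machine; the tag names and the `start` boundary parameter are assumptions for illustration and are not taken from the RW flow charts.

```python
def preceded_by_preposition(tags, i, start=0):
    """tags: one grammar code per word; i: index of a nominal; start: left edge of the region."""
    j = i - 1
    # Skip leftward over modifiers and adverbs, never leaving the region.
    while j >= start and tags[j] in ("modifier", "adverb"):
        j -= 1
    return j >= start and tags[j] == "preposition"
```

The search never consults the word forms themselves, only the codes assigned after dictionary lookup, and its result depends on where the syntactic boundary `start` is placed.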

Syntactic  Analysis 

It is generally recognized by all research centers that automatic
translation cannot be done without performing detailed syntactic
analyses of the text. In syntactic analyses, searches are made on
other than lexical units. These searches, made upon grammar codes and
resultant syntactic codes, have as part of their nature a conjunction
of conditions within variably defined syntactic boundaries.

Up to now, the prominent methods of analysis have necessarily
reverted to standard computer techniques. Yet, as indicated in the
following figures, this portion of the analysis, at least in the
Fulcrum Technique, is heavily loaded with "search" and "hunt"
procedures. In this area of the translation procedure, one would
expect to find associative memory type algorithms heavily used, and a
word size and memory module size different from those indicated for
the earlier portions of the program.

Semantics 

The portion of all machine translation programs which has been
most impervious to solution is that of correctly erasing all but one
of the possible English translations for a given foreign word. Some
groups have even been dismayed enough to indicate that human
intervention will always be required at this point. Our own research
has led us to the use of techniques which, with hindsight, would
appear to have called for a cascaded search. Among the problem types
are homograph resolution, idiom identification, word combination
recognition, and determiner-determinee pairing.

Idiom recognition requires the knowledge that two or more
words, when closely connected syntactically, possibly even
contiguous, have a quite different meaning and possibly a different
grammatic aspect than would be expected from the sum of the
parts of the idiom. Here techniques that have been used include
the dictionary labelling of each member of the idiom as being a
potential idiom and then, when a conjunction of possible members
occurs, searching in a special idiom table. Even the necessary
identification of the idiom potentiality requires search procedures.
The interaction between idiom identification and associative memory
computer configurations should be studied.
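
The two-stage technique just described (label each potential idiom
member in the dictionary, then consult a special idiom table only when
labelled members occur together) might be sketched as follows. This is
a minimal illustration only: the transliterated entries, the
restriction to contiguous members, and every name in the code are
assumptions made for the example, not the contents of the RW
dictionary or idiom table.

```python
# Hypothetical data: words labelled in the dictionary as potential idiom
# members, and a special idiom table keyed by the sequence of members.
POTENTIAL_IDIOM_MEMBERS = {"v", "techenie", "tom", "chisle"}
IDIOM_TABLE = {
    ("v", "techenie"): "during",
    ("v", "tom", "chisle"): "including",
}


def find_idioms(words, max_len=3):
    """Return (start_index, length, translation) for each contiguous idiom."""
    found = []
    i = 0
    while i < len(words):
        match = None
        # Stage 1: the idiom table is consulted only for labelled words.
        if words[i] in POTENTIAL_IDIOM_MEMBERS:
            for n in range(max_len, 1, -1):  # prefer the longest idiom
                candidate = tuple(words[i:i + n])
                if len(candidate) < n:
                    continue  # not enough words left in the sentence
                # Stage 2: a conjunction of labelled members triggers the search.
                if all(w in POTENTIAL_IDIOM_MEMBERS for w in candidate):
                    if candidate in IDIOM_TABLE:
                        match = (i, n, IDIOM_TABLE[candidate])
                        break
        if match:
            found.append(match)
            i += match[1]  # skip past the words consumed by the idiom
        else:
            i += 1
    return found
```

Idioms whose members are syntactically connected but not contiguous
would need a search bounded by syntactic rather than positional limits,
which this sketch does not attempt.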

Homograph  Resolution 

Many words share the same spelling. A dramatic example in
English is the spelling "fast," which has three different meanings.
As a verb, it means to not eat; as an adjective, it carries the two
opposite meanings of tied tight and rapid. Resolution of this type of
ambiguity is often made by searching in the neighborhood of the word
for indications of the word class of the homograph. Such resolution
would be more easily effected with a search memory.
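
A neighborhood search of the kind described here can be illustrated
with a small sketch. The word-class table, the window of one word on
each side, and the two rules are all assumptions chosen for the "fast"
example; they are not rules taken from the RW program.

```python
# Hypothetical word-class table for the neighbors of the homograph "fast".
WORD_CLASS = {
    "they": "pronoun",
    "will": "auxiliary",
    "to": "particle",
    "a": "article",
    "the": "article",
    "train": "noun",
}


def resolve_fast(words, index, window=1):
    """Guess the word class of words[index] from its immediate neighbors."""
    left = words[max(0, index - window):index]
    right = words[index + 1:index + 1 + window]
    # A following noun suggests the adjective reading ("a fast train").
    if any(WORD_CLASS.get(w) == "noun" for w in right):
        return "adjective"
    # A preceding pronoun, auxiliary, or "to" suggests the verb reading.
    if any(WORD_CLASS.get(w) in ("pronoun", "auxiliary", "particle") for w in left):
        return "verb"
    return "unresolved"
```

Note that resolving the word class still leaves the two adjective
meanings undistinguished; that choice belongs to the semantic
procedures discussed in this section.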

Word Combinations

Word combinations represent an idiom type whose configuration
is both lexical and syntactic; that is, the correct translation of a
primary word or of associated prepositions is determined by the
existence of the primary word and a particular type of syntactic
environment. This environment can be represented, for example, by
a word in the same clause with governed genitive and dative
dependencies, but with an explicitly missing accusative object. Since
indications of the limits of this environment are only vaguely fixed,
search is again called for.

Determiner-Determinee 

More than any other type of semantic resolution described
thus far, determiner-determinee resolution requires associative
memory techniques, as can be seen in the flow charts for the method.
Here the resolution problem is represented by the existence within
wide syntactic boundaries of two or more semantically connected
words. The "hunt" initially consists of ascertaining whether a
determinee exists. Since determinees are capable of having different
determiners, the next aspect of the search directs one to "hunting"
first the identification of the possible determiners, next the
establishment of their coexistence and then, finally, verification
of whether they exist in proper syntactic relationship to the
determinee. Determiner-determinee resolution has been awkward to
program on a standard computer. It should be far more effective when
associative memories are used.

Output 

A consumer of automatic translation thinks of a printed page
as being the output. This is only partially true. In truth, the printed
translation is the fruit of the process. The necessary evolutionary
improvements receive their sustenance from the research output.
Since the nature of handling the research output almost invariably
calls for the identification of particular translation configurations
in a large body of text, this most necessary area of automatic
translation would seem to benefit greatly from any hardware
development which eases the burdens of search. Here one would study
the proper way of emitting the research output so that it would be
best married to the coming hardware realities.

Dictionary  Lookup 

The  problem  of  dictionary  lookup  in  RW  machine  translation 
has  two  major  aspects.  Most  Russian  words  found  in  the  text  that  we 
process  actually  exist  in  our  dictionary.  A  sizeable  percentage  of 
the  words,  however,  are  represented  in  the  dictionary  with  a  different 
ending.  By  stripping  off  the  possible  suffixes  of  text  words  which  are 
not  found  directly  by  lookup  and  by  then  searching  on  stems,  the 
semantic  and  morphological  aspects  of  these  words  can  be  derived. 
The  use  of  a  search  type  memory  in  this  portion  of  the  problem  is 
obvious. 
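
The two-step lookup described above (direct lookup of the text word,
then suffix stripping and a search on the stem) might be sketched as
follows. The transliterated entries, the suffix list, and the function
name are hypothetical; the actual RW dictionary format and suffix
inventory are not reproduced here.

```python
# Hypothetical dictionary of full forms, dictionary of stems, and suffix list.
FULL_FORMS = {"kniga": "book"}
STEMS = {"knig": "book", "stol": "table"}
SUFFIXES = ["ami", "ov", "om", "a", "u", "y", "e", "i"]


def lookup(word):
    """Return (entry, stripped_suffix), or None if the word is not found.

    The stripped suffix is returned so that morphological information
    can later be derived from it.
    """
    # Step 1: direct lookup of the text word as it stands.
    if word in FULL_FORMS:
        return FULL_FORMS[word], ""
    # Step 2: strip possible suffixes, longest first, and search on the stem.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:-len(suffix)]
            if stem in STEMS:
                return STEMS[stem], suffix
    return None
```

For example, `lookup("knigami")` finds the stem "knig" after stripping
"ami", while `lookup("kniga")` succeeds directly without any stripping.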

DICTIONARY UPDATING


Our initial machine glossary has been updated in eight separate
runs. This added 6,909 new forms to the dictionary, which now
contains 23,713 forms. Since our stem-affixing procedures allow us to
grammar-code new forms from different forms of the same stem
found in the dictionary, this will give us a coverage of about 50,000
Russian forms.


FLOW  CHARTS  OF 

DETERMINER-DETERMINEE  METHOD 




Start

PST
    Store zeros in AWNC and AWNC1.
    Save indexes; store zeros in a block of 100 addresses.
    Bump address (to next word).
    Test: out of sentence?  (yes -> [label illegible])
    Is the current word a determinee (is "B" the second character
    of the semantic code)?  (no / yes)
    yes: Store the address containing the word number of the
         determinee in the next empty address of the 100-word block.
    Is block filled?  (no)

PLFC
    Is the current word a determiner (is "C" the third character
    of the semantic code word)?  (no / yes)
    yes: Is AWNC filled?
    no:  Store the address containing the word number of the
         determiner in AWNC.

Connectors shown at right: -> PLFC, -> PTSF*, -> PAWNC

* Not yet written
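
One reading of the scanning pass charted above can be sketched as
follows. Several points are assumptions rather than facts from the
chart: that the overflow of the 100-entry block leads to the unwritten
routine PTSF (represented here by an exception), that a second
determiner goes to AWNC1, that any further determiners are ignored, and
that word numbers are stored directly rather than the addresses
containing them. The function name and the sample codes are
hypothetical.

```python
def scan_sentence(semantic_codes, block_size=100):
    """Collect determinee word numbers and the first two determiners.

    semantic_codes: one semantic code string per word of the sentence.
    Returns (determinees, awnc, awnc1).
    """
    determinees = []        # stands in for the block of 100 addresses
    awnc = awnc1 = None     # stand in for AWNC and AWNC1 (None = zeros)
    for word_number, code in enumerate(semantic_codes):
        # A determinee has "B" as the second character of its code.
        if len(code) > 1 and code[1] == "B":
            if len(determinees) == block_size:
                # Assumed stand-in for the unwritten routine PTSF.
                raise OverflowError("determinee block filled")
            determinees.append(word_number)
        # A determiner has "C" as the third character of its code.
        if len(code) > 2 and code[2] == "C":
            if awnc is None:
                awnc = word_number
            elif awnc1 is None:
                awnc1 = word_number
            # Assumption: determiners beyond the second are ignored.
    return determinees, awnc, awnc1
```

With the hypothetical codes `["AB0", "00C", "000", "0BC"]`, words 0
and 3 are recorded as determinees, and words 1 and 3 as the two
determiners.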

PAWNC
    Store the address containing the word number of the determiner
    in AWNC1.
    [label illegible]

PTES
    Is the first address of the 100-word block containing addresses
    of word numbers of determinees filled?  (yes: continue)
    Is AWNC filled?  (yes: continue)
    Are the contents of the first address of the 100-word block and
    AWNC equal?  (yes: continue)
    Is the second address of the 100-word block filled?
    (no: continue)
    Is AWNC1 filled?  (no: continue)
    ENT*

Connectors shown at right: -> ENT*, -> ENT*, -> ENT*,
yes -> [label illegible]

PBS
    Pick up the next entry in the 100-word block and pick up the
    word number referred to there.
    Is the word number less than the word number of the first entry
    in the determinee table?  (yes -> PNIT*; no: continue)
    Is the word number the same as the first entry in the determinee
    table?  (yes -> PLPM; no: continue)
    Is the word number more than the last entry in the determinee
    table?  (yes -> PNIT*; no: continue)
    Is the word number the same as the last entry in the determinee
    table?  (yes -> PRPM; no -> PCLN)

* Not yet written

PCLN
    Add together the addresses of the top and bottom parameters of
    the table and divide by 2 (gives midpoint of table).
    Is the contents of this midpoint of the table more than the word
    number of the current determinee?
    (yes -> Make the midpoint of the table the new bottom parameter;
    no: continue)
    Is the contents of the midpoint equal to the word number of the
    current determinee?  (no: continue)
    Make the midpoint of the table the new top parameter.
    Do the top parameter and the bottom parameter of the table differ
    (-> [label illegible]*)

PLPM
    Save address of match.  -> PMPM+2

PRPM
    Save address of match.

PMPM
    Save address of match.
    Go to the determiner table** to the address indicated in the
    determinee table.
    Pick up tests. Ascertain what test applies.
    Does the computer word examined have an indication that it
    contains tests?
    Does code indicate a modifier determined by a following nominal?
    (no: continue)
    Does code indicate a nominal determined by a following nominal?
    (no: continue)
    Does code indicate a preposition determined by a following
    nominal?  (no: continue)
    Does code indicate a nominal determined by a preceding modifier?
    (no: continue)
    Does code indicate a genitive nominal determined by a preceding
    nominal?

Other labels shown: PMPM2, PNP1, FRP2, PRP3
Connectors shown at right (routine labels mostly illegible): one is
marked "(error in table)"; several are marked "yes"; several routines
carry an asterisk.

* Not yet written

** See page S for the format of the determiner table
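
The checks against the first and last table entries, followed by the
midpoint procedure at PCLN, amount to a binary search of a sorted
determinee table. A minimal sketch is given below. It assumes that the
table is sorted in ascending order, that "top" is the lower index and
"bottom" the higher, and that the search ends without a match when the
two parameters become adjacent; the last condition is not fully legible
in the chart. The function name and the sample table are hypothetical.

```python
def find_determinee(table, word_number):
    """Return the index of word_number in the sorted table, or None."""
    if not table:
        return None
    # PBS: compare against the first and last entries before halving.
    if word_number < table[0] or word_number > table[-1]:
        return None                      # not in table
    if word_number == table[0]:
        return 0                         # match at the first entry
    if word_number == table[-1]:
        return len(table) - 1            # match at the last entry
    top, bottom = 0, len(table) - 1
    # PCLN: halve the interval until top and bottom are adjacent.
    while bottom - top > 1:
        midpoint = (top + bottom) // 2
        if table[midpoint] > word_number:
            bottom = midpoint            # midpoint becomes new bottom
        elif table[midpoint] == word_number:
            return midpoint              # save address of match
        else:
            top = midpoint               # midpoint becomes new top
    return None                          # interval exhausted, no match
```

With the sample table `[3, 8, 15, 21, 40]`, a search for 8 inspects
index 2 and then index 1, where it matches; a search for 9 narrows to
the adjacent pair of indexes 1 and 2 and returns None.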

[label illegible]
    Bump address to next word.
    Is the word following the current determinee out of sentence
    (skip adverbs and modifiers)?  (no: continue)
    Is the word following the current determinee a nominal
    (skip adverbs and modifiers)?  (no / yes)
    Does it have any genitive agreement bits?  (no / yes)
    yes: Save word number of word following determinee as potential
         determiner.

Connectors shown at right: "finished" and "found" exits; the routine
labels are illegible.




PWE

PWE+1
    Pick up the next determiner word number in the determiner table.
    Have we passed the last determiner in the determiner table for
    the determinee on which we are working?
    (yes -> Finished exit; no: continue)
    Is the word number of the potential determiner the same as the
    word number for this entry in the determiner table?
    (yes: continue)
    Is the test fulfilled by this determiner equal to the one which
    initiated the search?  (yes: continue)
    Place flags in one or more of 5 consecutive locations indicating
    which translation or translations apply.
    Found exit

A second connector at right also leads to Finished exit.
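
The matching loop charted at PWE can be sketched as follows. The
representation of a determiner table entry as a dictionary with
"word_number", "test", and "translations" fields is an assumption made
for the example (the actual table format is described elsewhere in the
report), as is the choice to continue with the next entry when either
comparison fails.

```python
def match_determiner(entries, candidate_word_number, initiating_test):
    """Return five translation flags on a match ("found"), else None.

    entries: the determiner table entries for the current determinee.
    Each entry is a hypothetical dict with keys "word_number", "test",
    and "translations" (indexes 0-4 of the translations that apply).
    """
    for entry in entries:                     # pick up next determiner
        if entry["word_number"] != candidate_word_number:
            continue                          # not this determiner
        if entry["test"] != initiating_test:
            continue                          # wrong syntactic test
        flags = [False] * 5                   # 5 consecutive locations
        for index in entry["translations"]:
            flags[index] = True
        return flags                          # "found" exit
    return None                               # "finished" exit
```

The flags returned here are the ones a later pass can consult when
deciding which English translations to keep.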

[label illegible]
    Search for first non-blank character in English field.

PBE5+2
    Is this a character of a translation we want (first translation,
    etc.; check flags set in IWB)?  (yes: continue)
    Move to next character.
    Has the 6th computer word of the English field been reached?
    (no / yes)
    -> [label illegible] (same logic as here but processes last
    [illegible] computer word of English field)

PBE7
    Is next character a plus?  (no -> PBE5; yes: continue)
    Indicate another translation has been reached.
    Blank out character.
    -> PBE5+2
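
The effect of this last chart, erasing from the English field every
translation whose flag is not set, can be illustrated with a short
sketch. It assumes that alternative translations are separated by plus
signs, that unwanted characters and the plus signs themselves are
replaced by blanks so that the field keeps its length, and that flags
are given as a list of booleans; the division of the field into
computer words is ignored. The function name and the sample field are
hypothetical.

```python
def erase_unwanted(field, flags):
    """Blank out every translation in the field whose flag is not set."""
    kept = []
    translation = 0                 # index of the translation being scanned
    for character in field:
        if character == "+":
            translation += 1        # another translation has been reached
            kept.append(" ")        # assumption: separators are blanked
        elif translation < len(flags) and flags[translation]:
            kept.append(character)  # character of a translation we want
        else:
            kept.append(" ")        # blank out character
    return "".join(kept)
```

For the field `"fast+rapid+firm"` with only the second flag set, the
result keeps "rapid" in its original position and blanks the rest.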

CONCLUSIONS AND RECOMMENDATIONS


Semantics

The determiner-determinee method investigated under this
contract yields results for resolving some classes of English
equivalent ambiguity. The research procedure used in pursuing this
study will be fruitful in providing rules for choosing semantically
appropriate English equivalents for a large list of Russian words.

It  is  recommended  that  this  technique  be  further  refined  and 
applied  to  a  significant  body  of  fresh  text. 

Syntax 

The ability to rearrange translations from Russian word
order into idiomatic English word order is dependent upon the
correct determination of clause boundaries. Research done under
the present contract in the identification of subject and object
packages, in conjunction with other clause boundary indicators,
allowed an improved rearrangement procedure. It is recommended
that continued study be applied to the interrelated problems of the
resolution of clause boundaries, the identification of the functional
attributes of major syntactic units, and relations between clauses
within the sentence.

Diagnostic  Procedure 

The growing complexity of the machine translation programs
and the recognition of the fact that they are constantly being changed
for experimental purposes have highlighted the need for diagnostic
procedures. Systematic diagnostic procedures similar to those
developed for computer checkout can be applied to machine translation
programs. Such a procedure can help in isolating the causes of
deterioration of previously trusted areas when a change is applied
to a supposedly unrelated portion of the program. A computer
program was developed under this contract to perform this
diagnostic function.

Computer Configuration

Work  in  the  semantic  area  of  this  contract  highlighted  the 
desirability  of  using  a  search  type  memory  throughout  the 
machine  translation  process. 




